Overview

Dataset statistics

Number of variables: 12
Number of observations: 205
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.3 KiB
Average record size in memory: 96.6 B

Variable types

Categorical: 1
Numeric: 11

Alerts

name has a high cardinality: 147 distinct values [High cardinality]

wheelbase is highly correlated with carlength and 7 other fields [High correlation]
carlength is highly correlated with wheelbase and 8 other fields [High correlation]
carwidth is highly correlated with wheelbase and 8 other fields [High correlation]
curbweight is highly correlated with wheelbase and 9 other fields [High correlation]
cylindernumber is highly correlated with curbweight and 5 other fields [High correlation]
enginesize is highly correlated with wheelbase and 9 other fields [High correlation]
boreratio is highly correlated with wheelbase and 8 other fields [High correlation]
horsepower is highly correlated with wheelbase and 9 other fields [High correlation]
citympg is highly correlated with carlength and 8 other fields [High correlation]
highwaympg is highly correlated with wheelbase and 9 other fields [High correlation]
price is highly correlated with wheelbase and 9 other fields [High correlation]

wheelbase is highly correlated with carlength and 5 other fields [High correlation]
carlength is highly correlated with wheelbase and 8 other fields [High correlation]
carwidth is highly correlated with wheelbase and 9 other fields [High correlation]
curbweight is highly correlated with wheelbase and 9 other fields [High correlation]
cylindernumber is highly correlated with carwidth and 4 other fields [High correlation]
enginesize is highly correlated with wheelbase and 9 other fields [High correlation]
boreratio is highly correlated with carlength and 7 other fields [High correlation]
horsepower is highly correlated with carlength and 8 other fields [High correlation]
citympg is highly correlated with carlength and 7 other fields [High correlation]
highwaympg is highly correlated with wheelbase and 8 other fields [High correlation]
price is highly correlated with wheelbase and 9 other fields [High correlation]

wheelbase is highly correlated with carlength and 3 other fields [High correlation]
carlength is highly correlated with wheelbase and 6 other fields [High correlation]
carwidth is highly correlated with wheelbase and 6 other fields [High correlation]
curbweight is highly correlated with wheelbase and 7 other fields [High correlation]
enginesize is highly correlated with carlength and 6 other fields [High correlation]
horsepower is highly correlated with curbweight and 4 other fields [High correlation]
citympg is highly correlated with carlength and 6 other fields [High correlation]
highwaympg is highly correlated with carlength and 6 other fields [High correlation]
price is highly correlated with wheelbase and 7 other fields [High correlation]

wheelbase is highly correlated with carlength and 9 other fields [High correlation]
carlength is highly correlated with wheelbase and 9 other fields [High correlation]
carwidth is highly correlated with wheelbase and 9 other fields [High correlation]
curbweight is highly correlated with wheelbase and 9 other fields [High correlation]
cylindernumber is highly correlated with wheelbase and 9 other fields [High correlation]
enginesize is highly correlated with wheelbase and 9 other fields [High correlation]
boreratio is highly correlated with wheelbase and 9 other fields [High correlation]
horsepower is highly correlated with wheelbase and 9 other fields [High correlation]
citympg is highly correlated with wheelbase and 9 other fields [High correlation]
highwaympg is highly correlated with wheelbase and 9 other fields [High correlation]
price is highly correlated with wheelbase and 9 other fields [High correlation]

name is uniformly distributed [Uniform]

Reproduction

Analysis started: 2022-09-07 17:42:45.983876
Analysis finished: 2022-09-07 17:43:07.224481
Duration: 21.24 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 147
Distinct (%): 71.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
toyota corona | 6
toyota corolla | 6
peugeot 504 | 6
subaru dl | 4
mitsubishi mirage g4 | 3
Other values (142) | 180

Length

Max length: 31
Median length: 24
Mean length: 14.14634146
Min length: 6

Characters and Unicode

Total characters: 2900
Distinct characters: 46
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
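
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The function name `category_counts` and the three sample names (taken from the first rows of this dataset) are illustrative choices, not part of the report's own implementation. Note that `unicodedata.category` returns two-letter codes (`Ll` for Lowercase Letter, `Zs` for Space Separator, `Nd` for Decimal Number, and so on) rather than the long names shown in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. 'a' -> 'Ll', ' ' -> 'Zs', '5' -> 'Nd', '(' -> 'Ps'
            counts[unicodedata.category(ch)] += 1
    return counts


# Three names from the sample rows of this report (illustrative subset only).
sample = ["alfa-romero giulia", "audi 100 ls", "audi 5000s (diesel)"]
for code, count in category_counts(sample).most_common():
    print(code, count)
```

Run over the full `name` column, a tally of this kind should correspond to the category table in this section, assuming the report uses the same general-category property.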

Unique

Unique: 109
Unique (%): 53.2%
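
The report gives both a Distinct count (147) and a Unique count (109) for this variable. The sketch below shows one way to compute the two figures, under the assumption that "distinct" means the number of different values and "unique" means the number of values that occur exactly once. The helper name `distinct_and_unique` is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Small illustrative input: 'audi 100ls' repeats, the other two names do not.
names = ["audi 100 ls", "audi 100ls", "audi fox", "audi 100ls"]
print(distinct_and_unique(names))
```

Under that reading, 147 - 109 = 38 distinct names appear more than once, and together they account for the remaining 205 - 109 = 96 rows.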

Sample

1st row: alfa-romero giulia
2nd row: alfa-romero stelvio
3rd row: alfa-romero Quadrifoglio
4th row: audi 100 ls
5th row: audi 100ls

Common Values

Value | Count | Frequency (%)
toyota corona | 6 | 2.9%
toyota corolla | 6 | 2.9%
peugeot 504 | 6 | 2.9%
subaru dl | 4 | 2.0%
mitsubishi mirage g4 | 3 | 1.5%
mazda 626 | 3 | 1.5%
toyota mark ii | 3 | 1.5%
mitsubishi outlander | 3 | 1.5%
mitsubishi g4 | 3 | 1.5%
honda civic | 3 | 1.5%
Other values (137) | 165 | 80.5%

Length

[Figure: histogram of lengths of the category]

Words

Value | Count | Frequency (%)
toyota | 31 | 6.4%
nissan | 18 | 3.7%
mazda | 15 | 3.1%
mitsubishi | 13 | 2.7%
honda | 13 | 2.7%
corolla | 12 | 2.5%
subaru | 12 | 2.5%
peugeot | 11 | 2.3%
volvo | 11 | 2.3%
sw | 10 | 2.0%
Other values (167) | 342 | 70.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 285 | 9.8%
a | 259 | 8.9%
o | 243 | 8.4%
t | 167 | 5.8%
e | 158 | 5.4%
s | 153 | 5.3%
i | 147 | 5.1%
l | 138 | 4.8%
r | 133 | 4.6%
c | 126 | 4.3%
Other values (36) | 1091 | 37.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2384 | 82.2%
Space Separator | 285 | 9.8%
Decimal Number | 179 | 6.2%
Close Punctuation | 13 | 0.4%
Dash Punctuation | 13 | 0.4%
Open Punctuation | 13 | 0.4%
Uppercase Letter | 13 | 0.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 259 | 10.9%
o | 243 | 10.2%
t | 167 | 7.0%
e | 158 | 6.6%
s | 153 | 6.4%
i | 147 | 6.2%
l | 138 | 5.8%
r | 133 | 5.6%
c | 126 | 5.3%
u | 126 | 5.3%
Other values (15) | 734 | 30.8%

Decimal Number
Value | Count | Frequency (%)
0 | 44 | 24.6%
4 | 37 | 20.7%
1 | 23 | 12.8%
2 | 21 | 11.7%
5 | 18 | 10.1%
9 | 12 | 6.7%
6 | 12 | 6.7%
3 | 10 | 5.6%
7 | 2 | 1.1%

Uppercase Letter
Value | Count | Frequency (%)
M | 4 | 30.8%
D | 3 | 23.1%
U | 1 | 7.7%
X | 1 | 7.7%
Q | 1 | 7.7%
V | 1 | 7.7%
C | 1 | 7.7%
N | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 285 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 13 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 13 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2397 | 82.7%
Common | 503 | 17.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 259 | 10.8%
o | 243 | 10.1%
t | 167 | 7.0%
e | 158 | 6.6%
s | 153 | 6.4%
i | 147 | 6.1%
l | 138 | 5.8%
r | 133 | 5.5%
c | 126 | 5.3%
u | 126 | 5.3%
Other values (23) | 747 | 31.2%

Common
Value | Count | Frequency (%)
(space) | 285 | 56.7%
0 | 44 | 8.7%
4 | 37 | 7.4%
1 | 23 | 4.6%
2 | 21 | 4.2%
5 | 18 | 3.6%
) | 13 | 2.6%
- | 13 | 2.6%
( | 13 | 2.6%
9 | 12 | 2.4%
Other values (3) | 24 | 4.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2900 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 285 | 9.8%
a | 259 | 8.9%
o | 243 | 8.4%
t | 167 | 5.8%
e | 158 | 5.4%
s | 153 | 5.3%
i | 147 | 5.1%
l | 138 | 4.8%
r | 133 | 4.6%
c | 126 | 4.3%
Other values (36) | 1091 | 37.6%

wheelbase
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 53
Distinct (%): 25.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.75658537
Minimum: 86.6
Maximum: 120.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 86.6
5th percentile: 93.02
Q1: 94.5
Median: 97
Q3: 102.4
95th percentile: 110
Maximum: 120.9
Range: 34.3
Interquartile range (IQR): 7.9

Descriptive statistics

Standard deviation: 6.021775685
Coefficient of variation (CV): 0.06097594062
Kurtosis: 1.017038946
Mean: 98.75658537
Median Absolute Deviation (MAD): 2.7
Skewness: 1.050213776
Sum: 20245.1
Variance: 36.2617824
Monotonicity: Not monotonic
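
Several of the wheelbase figures are derived from the others, which makes them easy to cross-check. The sketch below recomputes range, IQR, coefficient of variation, variance, and mean from the reported values, assuming the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / n) and the 205 observations stated in the overview. The dictionary keys and the `derived` helper are illustrative names.

```python
import math

# Values copied from the wheelbase statistics in this report.
wheelbase = {
    "n": 205, "sum": 20245.1,
    "min": 86.6, "max": 120.9,
    "q1": 94.5, "q3": 102.4,
    "mean": 98.75658537, "std": 6.021775685,
}


def derived(stats):
    """Recompute the statistics that follow directly from the others."""
    return {
        "range": stats["max"] - stats["min"],
        "iqr": stats["q3"] - stats["q1"],
        "cv": stats["std"] / stats["mean"],
        "variance": stats["std"] ** 2,
        "mean": stats["sum"] / stats["n"],
    }


for name, value in derived(wheelbase).items():
    print(f"{name}: {value:.6f}")
```

The recomputed values agree with the reported Range (34.3), IQR (7.9), CV (0.06097594062), Variance (36.2617824), and Mean (98.75658537) up to rounding.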
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
94.5 | 21 | 10.2%
93.7 | 20 | 9.8%
95.7 | 13 | 6.3%
96.5 | 8 | 3.9%
97.3 | 7 | 3.4%
98.4 | 7 | 3.4%
104.3 | 6 | 2.9%
100.4 | 6 | 2.9%
107.9 | 6 | 2.9%
98.8 | 6 | 2.9%
Other values (43) | 105 | 51.2%

Smallest values

Value | Count | Frequency (%)
86.6 | 2 | 1.0%
88.4 | 1 | 0.5%
88.6 | 2 | 1.0%
89.5 | 3 | 1.5%
91.3 | 2 | 1.0%
93 | 1 | 0.5%
93.1 | 5 | 2.4%
93.3 | 1 | 0.5%
93.7 | 20 | 9.8%
94.3 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
120.9 | 1 | 0.5%
115.6 | 2 | 1.0%
114.2 | 4 | 2.0%
113 | 2 | 1.0%
112 | 1 | 0.5%
110 | 3 | 1.5%
109.1 | 5 | 2.4%
108 | 1 | 0.5%
107.9 | 6 | 2.9%
106.7 | 1 | 0.5%

carlength
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 75
Distinct (%): 36.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 174.0492683
Minimum: 141.1
Maximum: 208.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 141.1
5th percentile: 157.14
Q1: 166.3
Median: 173.2
Q3: 183.1
95th percentile: 196.36
Maximum: 208.1
Range: 67
Interquartile range (IQR): 16.8

Descriptive statistics

Standard deviation: 12.33728853
Coefficient of variation (CV): 0.0708838862
Kurtosis: -0.08289485345
Mean: 174.0492683
Median Absolute Deviation (MAD): 6.9
Skewness: 0.1559537713
Sum: 35680.1
Variance: 152.2086882
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
157.3 | 15 | 7.3%
188.8 | 11 | 5.4%
171.7 | 7 | 3.4%
186.7 | 7 | 3.4%
166.3 | 7 | 3.4%
165.3 | 6 | 2.9%
177.8 | 6 | 2.9%
176.2 | 6 | 2.9%
186.6 | 6 | 2.9%
172 | 5 | 2.4%
Other values (65) | 129 | 62.9%

Smallest values

Value | Count | Frequency (%)
141.1 | 1 | 0.5%
144.6 | 2 | 1.0%
150 | 3 | 1.5%
155.9 | 3 | 1.5%
156.9 | 1 | 0.5%
157.1 | 1 | 0.5%
157.3 | 15 | 7.3%
157.9 | 1 | 0.5%
158.7 | 3 | 1.5%
158.8 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
208.1 | 1 | 0.5%
202.6 | 2 | 1.0%
199.6 | 2 | 1.0%
199.2 | 1 | 0.5%
198.9 | 4 | 2.0%
197 | 1 | 0.5%
193.8 | 1 | 0.5%
192.7 | 3 | 1.5%
191.7 | 1 | 0.5%
190.9 | 2 | 1.0%

carwidth
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 44
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.90780488
Minimum: 60.3
Maximum: 72.3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 60.3
5th percentile: 63.6
Q1: 64.1
Median: 65.5
Q3: 66.9
95th percentile: 70.46
Maximum: 72.3
Range: 12
Interquartile range (IQR): 2.8

Descriptive statistics

Standard deviation: 2.145203853
Coefficient of variation (CV): 0.03254855562
Kurtosis: 0.7027642441
Mean: 65.90780488
Median Absolute Deviation (MAD): 1.4
Skewness: 0.9040034988
Sum: 13511.1
Variance: 4.60189957
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=44)]

Common values

Value | Count | Frequency (%)
63.8 | 24 | 11.7%
66.5 | 23 | 11.2%
65.4 | 15 | 7.3%
63.6 | 11 | 5.4%
64.4 | 10 | 4.9%
68.4 | 10 | 4.9%
64 | 9 | 4.4%
65.5 | 8 | 3.9%
65.2 | 7 | 3.4%
64.2 | 6 | 2.9%
Other values (34) | 82 | 40.0%

Smallest values

Value | Count | Frequency (%)
60.3 | 1 | 0.5%
61.8 | 1 | 0.5%
62.5 | 1 | 0.5%
63.4 | 1 | 0.5%
63.6 | 11 | 5.4%
63.8 | 24 | 11.7%
63.9 | 3 | 1.5%
64 | 9 | 4.4%
64.1 | 2 | 1.0%
64.2 | 6 | 2.9%

Largest values

Value | Count | Frequency (%)
72.3 | 1 | 0.5%
72 | 1 | 0.5%
71.7 | 3 | 1.5%
71.4 | 3 | 1.5%
70.9 | 1 | 0.5%
70.6 | 1 | 0.5%
70.5 | 1 | 0.5%
70.3 | 3 | 1.5%
69.6 | 2 | 1.0%
68.9 | 4 | 2.0%

curbweight
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 171
Distinct (%): 83.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2555.565854
Minimum: 1488
Maximum: 4066
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 1488
5th percentile: 1901
Q1: 2145
Median: 2414
Q3: 2935
95th percentile: 3503
Maximum: 4066
Range: 2578
Interquartile range (IQR): 790

Descriptive statistics

Standard deviation: 520.6802035
Coefficient of variation (CV): 0.2037436064
Kurtosis: -0.0428537661
Mean: 2555.565854
Median Absolute Deviation (MAD): 386
Skewness: 0.6813981891
Sum: 523891
Variance: 271107.8743
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
2385 | 4 | 2.0%
1918 | 3 | 1.5%
2275 | 3 | 1.5%
1989 | 3 | 1.5%
2410 | 2 | 1.0%
2191 | 2 | 1.0%
2535 | 2 | 1.0%
2024 | 2 | 1.0%
2414 | 2 | 1.0%
4066 | 2 | 1.0%
Other values (161) | 180 | 87.8%

Smallest values

Value | Count | Frequency (%)
1488 | 1 | 0.5%
1713 | 1 | 0.5%
1819 | 1 | 0.5%
1837 | 1 | 0.5%
1874 | 2 | 1.0%
1876 | 2 | 1.0%
1889 | 1 | 0.5%
1890 | 1 | 0.5%
1900 | 1 | 0.5%
1905 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
4066 | 2 | 1.0%
3950 | 1 | 0.5%
3900 | 1 | 0.5%
3770 | 1 | 0.5%
3750 | 1 | 0.5%
3740 | 1 | 0.5%
3715 | 1 | 0.5%
3685 | 1 | 0.5%
3515 | 1 | 0.5%
3505 | 1 | 0.5%

cylindernumber
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.380487805
Minimum: 2
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2
5th percentile: 4
Q1: 4
Median: 4
Q3: 4
95th percentile: 6
Maximum: 12
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.080853764
Coefficient of variation (CV): 0.2467427857
Kurtosis: 13.71486634
Mean: 4.380487805
Median Absolute Deviation (MAD): 0
Skewness: 2.817459025
Sum: 898
Variance: 1.168244859
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=7)]

Common values

Value | Count | Frequency (%)
4 | 159 | 77.6%
6 | 24 | 11.7%
5 | 11 | 5.4%
8 | 5 | 2.4%
2 | 4 | 2.0%
3 | 1 | 0.5%
12 | 1 | 0.5%

Smallest values

Value | Count | Frequency (%)
2 | 4 | 2.0%
3 | 1 | 0.5%
4 | 159 | 77.6%
5 | 11 | 5.4%
6 | 24 | 11.7%
8 | 5 | 2.4%
12 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
12 | 1 | 0.5%
8 | 5 | 2.4%
6 | 24 | 11.7%
5 | 11 | 5.4%
4 | 159 | 77.6%
3 | 1 | 0.5%
2 | 4 | 2.0%

enginesize
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 44
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.9073171
Minimum: 61
Maximum: 326
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 61
5th percentile: 90
Q1: 97
Median: 120
Q3: 141
95th percentile: 201.2
Maximum: 326
Range: 265
Interquartile range (IQR): 44

Descriptive statistics

Standard deviation: 41.64269344
Coefficient of variation (CV): 0.3281346923
Kurtosis: 5.305682092
Mean: 126.9073171
Median Absolute Deviation (MAD): 23
Skewness: 1.947655045
Sum: 26016
Variance: 1734.113917
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=44)]

Common values

Value | Count | Frequency (%)
122 | 15 | 7.3%
92 | 15 | 7.3%
97 | 14 | 6.8%
98 | 14 | 6.8%
108 | 13 | 6.3%
90 | 12 | 5.9%
110 | 12 | 5.9%
109 | 8 | 3.9%
120 | 7 | 3.4%
141 | 7 | 3.4%
Other values (34) | 88 | 42.9%

Smallest values

Value | Count | Frequency (%)
61 | 1 | 0.5%
70 | 3 | 1.5%
79 | 1 | 0.5%
80 | 1 | 0.5%
90 | 12 | 5.9%
91 | 5 | 2.4%
92 | 15 | 7.3%
97 | 14 | 6.8%
98 | 14 | 6.8%
103 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
326 | 1 | 0.5%
308 | 1 | 0.5%
304 | 1 | 0.5%
258 | 2 | 1.0%
234 | 2 | 1.0%
209 | 3 | 1.5%
203 | 1 | 0.5%
194 | 3 | 1.5%
183 | 4 | 2.0%
181 | 6 | 2.9%

boreratio
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 38
Distinct (%): 18.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.329756098
Minimum: 2.54
Maximum: 3.94
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2.54
5th percentile: 2.97
Q1: 3.15
Median: 3.31
Q3: 3.58
95th percentile: 3.78
Maximum: 3.94
Range: 1.4
Interquartile range (IQR): 0.43

Descriptive statistics

Standard deviation: 0.2708437054
Coefficient of variation (CV): 0.08134040377
Kurtosis: -0.7850418332
Mean: 3.329756098
Median Absolute Deviation (MAD): 0.26
Skewness: 0.0201564181
Sum: 682.6
Variance: 0.07335631277
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=38)]

Common values

Value | Count | Frequency (%)
3.62 | 23 | 11.2%
3.19 | 20 | 9.8%
3.15 | 15 | 7.3%
3.03 | 12 | 5.9%
2.97 | 12 | 5.9%
3.46 | 9 | 4.4%
3.31 | 8 | 3.9%
3.43 | 8 | 3.9%
3.78 | 8 | 3.9%
3.27 | 7 | 3.4%
Other values (28) | 83 | 40.5%

Smallest values

Value | Count | Frequency (%)
2.54 | 1 | 0.5%
2.68 | 1 | 0.5%
2.91 | 7 | 3.4%
2.92 | 1 | 0.5%
2.97 | 12 | 5.9%
2.99 | 1 | 0.5%
3.01 | 5 | 2.4%
3.03 | 12 | 5.9%
3.05 | 6 | 2.9%
3.08 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
3.94 | 2 | 1.0%
3.8 | 2 | 1.0%
3.78 | 8 | 3.9%
3.76 | 1 | 0.5%
3.74 | 3 | 1.5%
3.7 | 5 | 2.4%
3.63 | 2 | 1.0%
3.62 | 23 | 11.2%
3.61 | 1 | 0.5%
3.6 | 1 | 0.5%

horsepower
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 59
Distinct (%): 28.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.1170732
Minimum: 48
Maximum: 288
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 48
5th percentile: 62
Q1: 70
Median: 95
Q3: 116
95th percentile: 180.8
Maximum: 288
Range: 240
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 39.54416681
Coefficient of variation (CV): 0.3798048255
Kurtosis: 2.68400616
Mean: 104.1170732
Median Absolute Deviation (MAD): 25
Skewness: 1.405310154
Sum: 21344
Variance: 1563.741129
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
68 | 19 | 9.3%
70 | 11 | 5.4%
69 | 10 | 4.9%
116 | 9 | 4.4%
110 | 8 | 3.9%
95 | 7 | 3.4%
114 | 6 | 2.9%
160 | 6 | 2.9%
101 | 6 | 2.9%
62 | 6 | 2.9%
Other values (49) | 117 | 57.1%

Smallest values

Value | Count | Frequency (%)
48 | 1 | 0.5%
52 | 2 | 1.0%
55 | 1 | 0.5%
56 | 2 | 1.0%
58 | 1 | 0.5%
60 | 1 | 0.5%
62 | 6 | 2.9%
64 | 1 | 0.5%
68 | 19 | 9.3%
69 | 10 | 4.9%

Largest values

Value | Count | Frequency (%)
288 | 1 | 0.5%
262 | 1 | 0.5%
207 | 3 | 1.5%
200 | 1 | 0.5%
184 | 2 | 1.0%
182 | 3 | 1.5%
176 | 2 | 1.0%
175 | 1 | 0.5%
162 | 2 | 1.0%
161 | 2 | 1.0%

citympg
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 29
Distinct (%): 14.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.2195122
Minimum: 13
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 13
5th percentile: 16
Q1: 19
Median: 24
Q3: 30
95th percentile: 37
Maximum: 49
Range: 36
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.542141653
Coefficient of variation (CV): 0.2594079379
Kurtosis: 0.5786483405
Mean: 25.2195122
Median Absolute Deviation (MAD): 5
Skewness: 0.6637040288
Sum: 5170
Variance: 42.79961741
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=29)]

Common values

Value | Count | Frequency (%)
31 | 28 | 13.7%
19 | 27 | 13.2%
24 | 22 | 10.7%
27 | 14 | 6.8%
17 | 13 | 6.3%
26 | 12 | 5.9%
23 | 12 | 5.9%
21 | 8 | 3.9%
25 | 8 | 3.9%
30 | 8 | 3.9%
Other values (19) | 53 | 25.9%

Smallest values

Value | Count | Frequency (%)
13 | 1 | 0.5%
14 | 2 | 1.0%
15 | 3 | 1.5%
16 | 6 | 2.9%
17 | 13 | 6.3%
18 | 3 | 1.5%
19 | 27 | 13.2%
20 | 3 | 1.5%
21 | 8 | 3.9%
22 | 4 | 2.0%

Largest values

Value | Count | Frequency (%)
49 | 1 | 0.5%
47 | 1 | 0.5%
45 | 1 | 0.5%
38 | 7 | 3.4%
37 | 6 | 2.9%
36 | 1 | 0.5%
35 | 1 | 0.5%
34 | 1 | 0.5%
33 | 1 | 0.5%
32 | 1 | 0.5%

highwaympg
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 30
Distinct (%): 14.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.75121951
Minimum: 16
Maximum: 54
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 16
5th percentile: 22
Q1: 25
Median: 30
Q3: 34
95th percentile: 42.8
Maximum: 54
Range: 38
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 6.886443131
Coefficient of variation (CV): 0.2239404889
Kurtosis: 0.4400703815
Mean: 30.75121951
Median Absolute Deviation (MAD): 5
Skewness: 0.5399971879
Sum: 6304
Variance: 47.423099
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=30)]

Common values

Value | Count | Frequency (%)
25 | 19 | 9.3%
38 | 17 | 8.3%
24 | 17 | 8.3%
30 | 16 | 7.8%
32 | 16 | 7.8%
34 | 14 | 6.8%
37 | 13 | 6.3%
28 | 13 | 6.3%
29 | 10 | 4.9%
33 | 9 | 4.4%
Other values (20) | 61 | 29.8%

Smallest values

Value | Count | Frequency (%)
16 | 2 | 1.0%
17 | 1 | 0.5%
18 | 2 | 1.0%
19 | 2 | 1.0%
20 | 2 | 1.0%
22 | 8 | 3.9%
23 | 7 | 3.4%
24 | 17 | 8.3%
25 | 19 | 9.3%
26 | 3 | 1.5%

Largest values

Value | Count | Frequency (%)
54 | 1 | 0.5%
53 | 1 | 0.5%
50 | 1 | 0.5%
47 | 2 | 1.0%
46 | 2 | 1.0%
43 | 4 | 2.0%
42 | 3 | 1.5%
41 | 3 | 1.5%
39 | 2 | 1.0%
38 | 17 | 8.3%

price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 189
Distinct (%): 92.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13276.71057
Minimum: 5118
Maximum: 45400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 5118
5th percentile: 6197
Q1: 7788
Median: 10295
Q3: 16503
95th percentile: 32472.4
Maximum: 45400
Range: 40282
Interquartile range (IQR): 8715

Descriptive statistics

Standard deviation: 7988.852332
Coefficient of variation (CV): 0.6017192504
Kurtosis: 3.051647871
Mean: 13276.71057
Median Absolute Deviation (MAD): 3306
Skewness: 1.777678156
Sum: 2721725.667
Variance: 63821761.58
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
8921 | 2 | 1.0%
9279 | 2 | 1.0%
7898 | 2 | 1.0%
8916.5 | 2 | 1.0%
7775 | 2 | 1.0%
8845 | 2 | 1.0%
7295 | 2 | 1.0%
7609 | 2 | 1.0%
6692 | 2 | 1.0%
6229 | 2 | 1.0%
Other values (179) | 185 | 90.2%

Smallest values

Value | Count | Frequency (%)
5118 | 1 | 0.5%
5151 | 1 | 0.5%
5195 | 1 | 0.5%
5348 | 1 | 0.5%
5389 | 1 | 0.5%
5399 | 1 | 0.5%
5499 | 1 | 0.5%
5572 | 2 | 1.0%
6095 | 1 | 0.5%
6189 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
45400 | 1 | 0.5%
41315 | 1 | 0.5%
40960 | 1 | 0.5%
37028 | 1 | 0.5%
36880 | 1 | 0.5%
36000 | 1 | 0.5%
35550 | 1 | 0.5%
35056 | 1 | 0.5%
34184 | 1 | 0.5%
34028 | 1 | 0.5%

Interactions

[Figures: 121 pairwise scatter plots, one for each ordered pair of the 11 numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
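
A minimal sketch of that calculation follows, using only the Python standard library. It assumes tied values receive the average of their rank positions, which is a common convention but is not stated in this report. The function names and the ten-row example (horsepower and citympg from the first rows of the sample) are illustrative.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def covariance_ratio(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


def spearman_rho(x, y):
    """Spearman's rho: the covariance ratio applied to the rank variables."""
    return covariance_ratio(average_ranks(x), average_ranks(y))


# horsepower and citympg for the first ten sample rows of this report.
horsepower = [111, 111, 154, 102, 115, 110, 110, 110, 140, 160]
citympg = [21, 21, 19, 24, 18, 19, 19, 19, 17, 16]
print(spearman_rho(horsepower, citympg))
```

The sketch raises `ZeroDivisionError` when either variable is constant, since its standard deviation is then zero.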

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
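
The calculation can be sketched in a few lines of standard-library Python. The function name `pearson_r` and the example data (enginesize and horsepower from the first ten sample rows) are illustrative. The second call checks the location and scale invariance described above, using a positive scale factor.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


# enginesize and horsepower for the first ten sample rows of this report.
enginesize = [130, 130, 152, 109, 136, 136, 136, 136, 131, 131]
horsepower = [111, 111, 154, 102, 115, 110, 110, 110, 140, 160]

r = pearson_r(enginesize, horsepower)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2.5 * v + 10 for v in enginesize], horsepower)
print(r, r_rescaled)
```

Population covariance and population standard deviation are used together here; using the sample versions for both would give the same ratio.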

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
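
The pair-counting definition translates directly into code. The sketch below implements exactly the formula as worded here, in which tied pairs count as neither concordant nor discordant (often called tau-a). This is an assumption about the variant: statistical libraries frequently default to a tie-corrected form (tau-b), so values for data with many ties may differ from those in this report. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither.
    return (concordant - discordant) / len(pairs)


# citympg and highwaympg for the first ten sample rows of this report.
citympg = [21, 21, 19, 24, 18, 19, 19, 19, 17, 16]
highwaympg = [27, 27, 26, 30, 22, 25, 25, 25, 20, 22]
print(kendall_tau_a(citympg, highwaympg))
```

The loop examines every pair, so its cost grows quadratically with the number of observations; for the 205 rows of this dataset that is about 21,000 pairs.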

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

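index | name | wheelbase | carlength | carwidth | curbweight | cylindernumber | enginesize | boreratio | horsepower | citympg | highwaympg | price
0 | alfa-romero giulia | 88.6 | 168.8 | 64.1 | 2548 | 4 | 130 | 3.47 | 111 | 21 | 27 | 13495.000
1 | alfa-romero stelvio | 88.6 | 168.8 | 64.1 | 2548 | 4 | 130 | 3.47 | 111 | 21 | 27 | 16500.000
2 | alfa-romero Quadrifoglio | 94.5 | 171.2 | 65.5 | 2823 | 6 | 152 | 2.68 | 154 | 19 | 26 | 16500.000
3 | audi 100 ls | 99.8 | 176.6 | 66.2 | 2337 | 4 | 109 | 3.19 | 102 | 24 | 30 | 13950.000
4 | audi 100ls | 99.4 | 176.6 | 66.4 | 2824 | 5 | 136 | 3.19 | 115 | 18 | 22 | 17450.000
5 | audi fox | 99.8 | 177.3 | 66.3 | 2507 | 5 | 136 | 3.19 | 110 | 19 | 25 | 15250.000
6 | audi 100ls | 105.8 | 192.7 | 71.4 | 2844 | 5 | 136 | 3.19 | 110 | 19 | 25 | 17710.000
7 | audi 5000 | 105.8 | 192.7 | 71.4 | 2954 | 5 | 136 | 3.19 | 110 | 19 | 25 | 18920.000
8 | audi 4000 | 105.8 | 192.7 | 71.4 | 3086 | 5 | 131 | 3.13 | 140 | 17 | 20 | 23875.000
9 | audi 5000s (diesel) | 99.5 | 178.2 | 67.9 | 3053 | 5 | 131 | 3.13 | 160 | 16 | 22 | 17859.167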

Last rows

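index | name | wheelbase | carlength | carwidth | curbweight | cylindernumber | enginesize | boreratio | horsepower | citympg | highwaympg | price
195 | volvo 144ea | 104.3 | 188.8 | 67.2 | 3034 | 4 | 141 | 3.78 | 114 | 23 | 28 | 13415.0
196 | volvo 244dl | 104.3 | 188.8 | 67.2 | 2935 | 4 | 141 | 3.78 | 114 | 24 | 28 | 15985.0
197 | volvo 245 | 104.3 | 188.8 | 67.2 | 3042 | 4 | 141 | 3.78 | 114 | 24 | 28 | 16515.0
198 | volvo 264gl | 104.3 | 188.8 | 67.2 | 3045 | 4 | 130 | 3.62 | 162 | 17 | 22 | 18420.0
199 | volvo diesel | 104.3 | 188.8 | 67.2 | 3157 | 4 | 130 | 3.62 | 162 | 17 | 22 | 18950.0
200 | volvo 145e (sw) | 109.1 | 188.8 | 68.9 | 2952 | 4 | 141 | 3.78 | 114 | 23 | 28 | 16845.0
201 | volvo 144ea | 109.1 | 188.8 | 68.8 | 3049 | 4 | 141 | 3.78 | 160 | 19 | 25 | 19045.0
202 | volvo 244dl | 109.1 | 188.8 | 68.9 | 3012 | 6 | 173 | 3.58 | 134 | 18 | 23 | 21485.0
203 | volvo 246 | 109.1 | 188.8 | 68.9 | 3217 | 6 | 145 | 3.01 | 106 | 26 | 27 | 22470.0
204 | volvo 264gl | 109.1 | 188.8 | 68.9 | 3062 | 4 | 141 | 3.78 | 114 | 19 | 25 | 22625.0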